Performance improvement of connected digit recognition using unsupervised fast speaker adaptation
Abstract
In this paper, we investigate unsupervised fast speaker adaptation based on eigenvoice to improve the performance of Korean connected digit recognition over the telephone channel. In addition, utterance verification is introduced into speaker adaptation to check whether an input utterance is suitable for adaptation. Performance evaluation showed that the proposed method improved recognition performance: we obtained an 18%-22% string error reduction with the N-best-based fast speaker adaptation method combined with utterance verification using a support vector machine (SVM).
Similar resources
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
On-line unsupervised adaptation in speaker verification
This paper presents a new approach to on-line unsupervised adaptation in speaker verification. The approach extends previous work by (1) improving performance on the enrollment handset-type when adapting on a different handset-type (e.g., improving performance on cellular when adapting on a landline office phone), (2) accomplishing this cross channel improvement without increasing the size of t...
Rapid Speaker Adaptation using Regression-
In this paper, regression-tree based spectral peak alignment is proposed for rapid speaker adaptation using the linearization of VTLN. Two different regression classes are investigated: phonetic classes (using combined knowledge and data-driven techniques) and mixture classes. Compared to MLLR and VTLN, improved performance can be obtained for both supervised and unsupervised adaptations on bot...
Unsupervised speaker adaptation using high confidence portion recognition results by multiple recognition systems
This paper describes an accurate unsupervised speaker adaptation method for lecture speech recognition using multiple LVCSRs. In an unsupervised speaker adaptation framework, the improvement of recognition performance by adapting acoustic models greatly depends on the accuracy of labels such as phonemes and syllables. Therefore, extraction of the adaptation data guided by the confidence measure...